NSF PAR Search | NSF Public Access Repository


Confidently Comparing Estimates with the c-value

https://doi.org/10.1080/01621459.2022.2153688

Trippe, Brian L.; Deshpande, Sameer K.; Broderick, Tamara (February 2023, Journal of the American Statistical Association)

Full Text Available
Gaussian processes at the Helm(holtz): A more fluid model for ocean currents

Berlinghieri, Renato; Trippe, Brian L; Burt, David R; Giordano, Ryan James; Srinivasan, Kaushik; Ozgokmen, Tamay; Xia, Junfei; Broderick, Tamara (July 2023, Proceedings of Machine Learning Research)

Full Text Available
LR-GLM: High-Dimensional Bayesian Inference Using Low-Rank Data Approximations

Trippe, Brian; Huggins, Jonathan; Agrawal, Raj; Broderick, Tamara (June 2019, Proceedings of Machine Learning Research)

Due to the ease of modern data collection, applied statisticians often have access to a large set of covariates that they wish to relate to some observed outcome. Generalized linear models (GLMs) offer a particularly interpretable framework for such an analysis. In these high-dimensional problems, the number of covariates is often large relative to the number of observations, so we face non-trivial inferential uncertainty; a Bayesian approach allows coherent quantification of this uncertainty. Unfortunately, existing methods for Bayesian inference in GLMs require running times roughly cubic in parameter dimension, and so are limited to settings with at most tens of thousands of parameters. We propose to reduce time and memory costs with a low-rank approximation of the data in an approach we call LR-GLM. When used with the Laplace approximation or Markov chain Monte Carlo, LR-GLM provides a full Bayesian posterior approximation and admits running times reduced by a full factor of the parameter dimension. We rigorously establish the quality of our approximation and show how the choice of rank allows a tunable computational–statistical trade-off. Experiments support our theory and demonstrate the efficacy of LR-GLM on real large-scale datasets.
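To make the low-rank idea concrete, here is a minimal sketch for the simplest case, conjugate Gaussian linear regression with an isotropic Gaussian prior. That special case, the function names, and the parameters `prior_var` and `noise_var` are assumptions chosen for illustration; this is not the authors' implementation. The data matrix is projected onto its top right singular vectors, so the posterior is stored through a D x rank factor rather than a D x D matrix.

```python
import numpy as np

def low_rank_posterior(X, y, rank, prior_var=1.0, noise_var=1.0):
    """Approximate Gaussian posterior over regression weights using a
    rank-`rank` approximation of X (hypothetical helper, illustration only)."""
    # Thin SVD: X = V @ diag(s) @ Ut, singular values in descending order.
    # A truncated or randomized SVD would be the practical choice at scale.
    V, s, Ut = np.linalg.svd(X, full_matrices=False)
    V, s, U = V[:, :rank], s[:rank], Ut[:rank].T  # U has shape (D, rank)

    a = 1.0 / prior_var        # prior precision (isotropic)
    d = s**2 / noise_var       # data precision along each retained direction

    # Posterior mean lies in the span of U; each direction is shrunk separately.
    mean = U @ ((s * (V.T @ y) / noise_var) / (a + d))
    # Covariance is prior_var * (I - U diag(shrink) U^T), by the Woodbury identity.
    shrink = d / (a + d)
    return mean, U, shrink

def dense_covariance(U, shrink, prior_var=1.0):
    """Materialize the D x D covariance; only sensible for small D."""
    D = U.shape[0]
    return prior_var * (np.eye(D) - (U * shrink) @ U.T)
```

With `rank` equal to the number of covariates the result coincides with the exact conjugate posterior; smaller ranks trade accuracy for time and memory, which is the tunable trade-off the abstract describes.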
Full Text Available
The Kernel Interaction Trick: Fast Bayesian Discovery of Pairwise Interactions in High Dimensions

Agrawal, Raj; Trippe, Brian; Huggins, Jonathan; Broderick, Tamara (June 2019, Proceedings of Machine Learning Research)

Discovering interaction effects on a response of interest is a fundamental problem faced in biology, medicine, economics, and many other scientific disciplines. In theory, Bayesian methods for discovering pairwise interactions enjoy many benefits such as coherent uncertainty quantification, the ability to incorporate background knowledge, and desirable shrinkage properties. In practice, however, Bayesian methods are often computationally intractable for even moderate-dimensional problems. Our key insight is that many hierarchical models of practical interest admit a Gaussian process representation such that rather than maintaining a posterior over all O(p^2) interactions, we need only maintain a vector of O(p) kernel hyper-parameters. This implicit representation allows us to run Markov chain Monte Carlo (MCMC) over model hyper-parameters in time and memory linear in p per iteration. We focus on sparsity-inducing models and show on datasets with a variety of covariate behaviors that our method: (1) reduces runtime by orders of magnitude over naive applications of MCMC, (2) provides lower Type I and Type II error relative to state-of-the-art LASSO-based approaches, and (3) offers improved computational scaling in high dimensions relative to existing Bayesian and LASSO-based approaches.
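The implicit representation can be illustrated with a standard kernel identity: the inner product of all O(p^2) pairwise-interaction features can be evaluated in O(p) time. The sketch below is an illustration of that general idea, not necessarily the kernel used in the paper; the function names and the per-covariate scale vector `kappa` (standing in for the O(p) kernel hyper-parameters) are assumptions.

```python
import numpy as np
from itertools import combinations

def interaction_kernel(x, z, kappa):
    """Inner product of all pairwise-interaction features, computed in O(p).

    Uses sum_{i<j} a_i a_j b_i b_j = 0.5 * ((a.b)^2 - sum_i a_i^2 b_i^2),
    with a = kappa * x and b = kappa * z.
    """
    xs, zs = kappa * x, kappa * z
    return 0.5 * (np.dot(xs, zs) ** 2 - np.dot(xs**2, zs**2))

def explicit_interaction_features(x, kappa):
    """The O(p^2) feature vector the kernel avoids building (for checking only)."""
    xs = kappa * x
    return np.array([xs[i] * xs[j] for i, j in combinations(range(len(xs)), 2)])
```

The two routes agree, so a model defined over the explicit interaction features can instead be handled through the kernel, with only the p entries of `kappa` to sample or tune.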
Full Text Available
